We consider the adversarial convex bandit problem and we build the first $\mathrm{poly}(T)$-time algorithm with $\mathrm{poly}(n) \sqrt{T}$-regret for this problem. To do so we introduce three new ideas in the derivative-free optimization literature: (i) kernel methods, (ii) a generalization of Bernoulli convolutions, and (iii) a new annealing schedule for exponential weights (with increasing learning rate). The basic version of our algorithm achieves $\tilde{O}(n^{9.5} \sqrt{T})$-regret, and we show that a simple variant of this algorithm can be run in $\mathrm{poly}(n \log(T))$-time per step at the cost of an additional $\mathrm{poly}(n) T^{o(1)}$ factor in the regret. These results improve upon the $\tilde{O}(n^{11} \sqrt{T})$-regret and $\exp(\mathrm{poly}(T))$-time result of the first two authors, and the $\log(T)^{\mathrm{poly}(n)} \sqrt{T}$-regret and $\log(T)^{\mathrm{poly}(n)}$-time result of Hazan and Li. Furthermore we conjecture that another variant of the algorithm could achieve $\tilde{O}(n^{1.5} \sqrt{T})$-regret, and moreover that this regret is unimprovable (the current best lower bound being $\Omega(n \sqrt{T})$, and it is achieved with linear functions). For the simpler situation of zeroth order stochastic convex optimization this corresponds to the conjecture that the optimal query complexity is of order $n^3 / \epsilon^2$.
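To make idea (iii) concrete, here is a minimal illustrative sketch of exponential weights with an increasing learning rate, on a finite set of experts standing in for a discretization of the convex domain. The geometric schedule `eta *= growth`, the random losses, and all parameter values are illustrative assumptions, not the paper's actual construction (which works over a continuous domain via kernels).

```python
import math
import random

def exp_weights_increasing_lr(losses, eta0=0.1, growth=1.05):
    """Exponential weights over n experts with an increasing learning rate.

    losses: list over rounds; each round is a list of per-expert losses in [0, 1].
    Returns the algorithm's regret against the best fixed expert.
    """
    n = len(losses[0])
    cum = [0.0] * n          # cumulative loss of each expert
    eta = eta0
    total = 0.0              # algorithm's expected loss
    for round_losses in losses:
        # Recompute weights each round with the current (larger) eta;
        # shifting by min(cum) leaves the distribution unchanged but
        # avoids exp() underflow as eta grows.
        mn = min(cum)
        w = [math.exp(-eta * (c - mn)) for c in cum]
        z = sum(w)
        p = [wi / z for wi in w]
        total += sum(pi * li for pi, li in zip(p, round_losses))
        cum = [c + l for c, l in zip(cum, round_losses)]
        eta *= growth        # "annealing": the learning rate increases over time
    return total - min(cum)

random.seed(0)
T, n = 200, 5
losses = [[random.random() for _ in range(n)] for _ in range(T)]
print(exp_weights_increasing_lr(losses))
```

As `eta` grows, the distribution `p` concentrates ever more sharply on the current leader, which is the qualitative effect the annealing schedule is after.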